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Abstract. In this paper, we present a two-stage hybrid 
Kalman filter to estimate both observation and forecast bias 
in hydrologic models, in addition to state variables. The bi- 
ases are estimated using the discrete Kalman filter, and the 
state variables using the ensemble Kalman filter. A key issue 
in this multi-component assimilation scheme is the exact par- 
titioning of the difference between observation and forecasts 
into state, forecast bias and observation bias updates. Here, 
the error covariances of the forecast bias and the unbiased 
states are calculated as constant fractions of the biased state 
error covariance, and the observation bias error covariance 
is a function of the observation prediction error covariance. 
In a series of synthetic experiments, focusing on the assim- 
ilation of discharge into a rainfall-runoff model, it is shown 
that both static and dynamic observation and forecast biases 
can be successfully estimated. The results indicate a strong 
improvement in the estimation of the state variables and re- 
sulting discharge as opposed to the use of a bias-unaware 
ensemble Kalman filter. Furthermore, minimal code modi- 
fication in existing data assimilation software is needed to 
implement the method. The results suggest that a better per- 
formance of data assimilation methods should be possible if 
both forecast and observation biases are taken into account. 


1 Introduction 

During the last decade, data assimilation has been frequently 
applied for the correction of errors in hydrologic model re- 

sults. These errors originate from uncertainties in meteoro- 
logical forcing data, model parameters, formulation of the 


model physics, and initial conditions. A number of methods 
are available for this purpose, of which the most commonly 
used are Newtonian nudging (Stauffer and Seaman, 1990), 
the extended Kalman filter (Welch and Bishop, 1995), the en- 
semble Kalman filter (Evensen, 1994), variational assimila- 
tion (Rabier et al., 2000), and the particle filter (Gordon et al., 
1993). These methods have been applied for the assimilation 
of various variables. Examples of these variables and stud- 
ies that focus on their assimilation are surface soil moisture 
values (Crow and van den Berg, 2010), surface temperatures 
(Meng et al., 2009), brightness temperatures (Seuffert et al., 
2004), radar backscatter values (Hoeben and Troch, 2000), 
snow water equivalent (De Lannoy et al., 2012), snow cover 
fraction (Su et al., 2010), piezometric head data (Chen and 
Zhang, 2006), chemical tracer data (Ng et al., 2009), and dis- 
charge values (Pauwels and De Lannoy, 2009). 

In many studies, observations are used that contain both 
random error and significant bias (Torres et al., 2012). Fur- 
thermore, hydrologic model results do not only contain ran- 
dom errors, but in many cases are also prone to bias (Ashfaq 
et al., 2010). Typically, the above mentioned methods only 
function optimally when the assimilated data and the model 
are free of bias. In order to bypass this inconsistency, a 
number of studies have focused on the removal of sys- 
tematic differences between the assimilated data and the 
model through rescaling the data to the model climatology 
(Reichle and Koster, 2004; Slater and Clark, 2006; De Lan- 
noy et al., 2012). Other studies have focused on the esti- 
mation of the forecast bias in addition to the model state 
variables, using the discrete (Kalman, 1960) and the ensem- 
ble Kalman filter for both linear and nonlinear systems, in a 
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wide range of applications ranging from groundwater mod- 
eling to soil moisture and temperature assimilation (Dee and 
Da Silva, 1998; Dee and Todling, 2000; De Lannoy et al., 
2007; Drecourt et al., 2006; Bosilovich et al., 2007; Reichle 
et al., 2010). Dee (2005) further explains how forecast bias 
can be taken into account in a data assimilation system using 
the Kalman filter or variational assimilation as assimilation 
algorithm. The estimation of observation biases through data 
assimilation has been investigated as well. Derber and Wu 
(1998) present a simple observation bias estimation scheme 
for the assimilation of radiance data into an atmospheric 
model. Auligne et al. (2007) and Dee and Uppsala (2009) 
used a variational approach to estimate satellite data biases, 
while Montzka et al. (2013) used the particle filter for the re- 
trieval of remotely sensed soil moisture biases. Another ap- 
proach, as opposed to state updating and online bias estima- 
tion, is to update model parameters in addition to state vari- 
ables, under the assumption that all forecast bias is caused by 
the model parameters (Moradkhani et al., 2005). 

There have been two major practical approaches for fore- 
cast bias estimation with a Kalman filter: state augmenta- 
tion (Drecourt et al., 2006; Kollat et al., 2011) and separate 
state and bias estimation (Friedland, 1969). Drecourt et al. 
(2006) compared both approaches using a linear groundwa- 
ter model, and concluded that both methodologies outper- 
formed the Kalman Filter without bias estimation. Two-stage 
state and bias estimation, referred to as bias-aware Kalman 
filtering by Drecourt et al. (2006), is an attractive approach 
where the state and the forecast bias are estimated individ- 
ually. Although it is clearly demonstrated that, in the pres- 
ence of forecast bias, this methodology outperforms the es- 
timation of the model state alone (Drecourt et al., 2006; 
De Lannoy et al., 2007), observations are assumed to be un- 
biased in these studies. Furthermore, we are not aware of as- 
similation approaches in hydrological studies that estimate 
both observation and forecast bias, in addition to state vari- 
ables. The objective of this paper is therefore to develop a 
methodology, based on the ensemble Kalman filter (EnKF), 
to estimate observation and forecast biases, as well as model 
state variables. More specifically, the methodology of Dee 
and Da Silva (1998), in which two Kalman filters are applied, 
is expanded to include observation biases as well. The major 
assumption of the proposed methodology, as opposed to state 
augmentation, is that the observation and forecast bias errors 
are independent of each other and of the errors in the unbi- 
ased model state variables. This assumption needs to be made 
in order to enable the derivation of a separate state and bias 
update equation. In this paper, we will demonstrate that, de- 
spite this assumption, reasonable results can be obtained. The 
equations for the estimation of the biases and the state vari- 
ables are derived for a linear system, after which the applica- 
tion for nonlinear systems in an ensemble framework is ex- 
plained. The method is then applied to a very simple rainfall- 
runoff model, into which discharge values are assimilated, 
first in a well-controlled synthetic experiment, and then in a 


real-world example. The performance of the new methodol- 
ogy is then analyzed in detail, and the possibilities for joint 
observation and forecast bias and model state estimation are 
assessed. 

2 Derivation for a linear system 
2.1 System description 

The equations for the simultaneous estimation of system 
states and both forecast and observation biases will first 
be derived for a linear system. The application of the ana- 
lytical expressions in an ensemble framework will then be 
explained. 

In the bias-aware Kalman filter, the state of the system is 
propagated from time step k — 1 to time step k: 

Xk = A,t_ix,fc_i + Bi_i f k -\ + Wk-\. (1) 

Xk is the biased state vector, and fk-i is the vector with 
model inputs (for example the meteorologic input data). 
Wk-i is the model error, which is a random error term with 
covariance Qk-i- Aa_i and B; i are model matrices propa- 
gating states and forcings at time step A — 1 to states at time 
k. For the remainder of the paper, variables indicated with 
a [.] refer to biased variables. The difference between this 
equation and the equation used in the derivation of the dis- 
crete Kalman filter (Kalman, 1960) is that, here, biased state 
variables are used instead of unbiased variables. 

The unbiased state vector x k is defined as 

x k =x k -b (2) 

b'J! is the forecast bias. The system is observed as follows: 

yk = K k Xk + v k + b° k . (3) 

5>a is the vector with the biased observations. Vk is the zero 
mean (unbiased) observation error with covariance R&. Ha is 
the observation matrix, and b° k is the observation bias. This 
equation differs from the equation used in the bias-free dis- 
crete Kalman filter through the observation bias term. The 
unbiased observations can be calculated as follows: 

yk = h - b° k . (4) 

The bias vectors are propagated as follows: 

um _ urn 

k ~ k ~ l (5) 

h° — h° ' 

These equations imply that, if no data are assimilated into the 
model, the bias at the next time step is simply assumed to be 
the same as in the previous time step. 

In the derivation of the two-stage state and bias update 
equations, it is important to remark that the errors in the 
observation and forecast biases are assumed independent of 
each other and of the error in the unbiased state of the system. 
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2.2 Propagation of the states and biases 


2.4 Definition of the error covariances 


In a simulation framework, an estimate of the system state is 
used in the model. For the remainder of this paper, estimated 
variables are denoted with an [.]. [.]“ indicates an a priori 
estimate (forecast, before the update), and [.] + an a posteriori 
estimate (analysis, after the update). The a posteriori state 
estimate at time step A: — 1 is propagated to time step k as 
follows: 

** = ( 6 ) 

The system is observed as follows: 

r k =n k x- k +b° k ~. (?) 


In order to derive an expression for the Kalman gains, a num- 
ber of error covariances need to be defined. The error covari- 
ances of the unbiased states, forecast biases and observation 
biases can be written as 


i =e 

(**-**) (** -*t) r 


T = e 

1 

*r- C 
1 
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+ 
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( 11 ) 


Similarly, the error covariance of the biased states can be 
written as 


The bias estimates are also propagated: 



2.3 Update of the states and biases 


( 8 ) 


The observations are used as follows to update the states and 
biases: 


K + = K~ + K (h - K~ - ^ (i* - K~)) 

K + = K~ + K 0* - K~ - Hr (5 - *?-)) • (9) 

K = ** - K + + Kr 0* - b°+ - Hr (i* - b m k + )) 


K'! ! , K°, and K* are the Kalman gains for the forecast bias, 
the observation bias, and the system states, respectively. The 
filter thus works in two steps. First, the a priori bias and state 
estimates are used to update the observation and forecast bias 
estimates. The update is performed using the unbiased esti- 
mates: the observation bias is subtracted from the observa- 
tions, and the observation matrix is applied to the unbiased 
state vector. The a posteriori forecast bias estimate b’ k + is 
then used to calculate the a priori unbiased state estimate, 
which is defined as 





( 12 ) 


2.5 The equations for a bias-aware linear system 


Appendix B provides the details of the derivation of the equa- 
tions for the application of the bias-aware Kalman filter. Fig- 
ure 1 shows a schematic of the equations that need to be 
applied. The state and bias vectors in step 1 can be initial- 
ized with any appropriate prior guess, i.e., typically a spun 
up biased state estimate for x^_ l , and zero estimates for the 
biases. In other words, the model can be applied repeatedly 
using the same forcings for one or a number of years, until 
the state variables of the first time step converge to a specific 
value. 


2.6 Interpretation of the expressions for the Kalman 
gains 


Step 3 and step 5 in Fig. 1 list the expressions for the three 
Kalman gains. These expressions can be compared to the ex- 
pression for the Kalman gain for a linear, bias-unaware (or 
unbiased) system: 


K = P.- H[ [h,p-h[ + r, 


(13) 


This definition, in combination with the a posteriori obser- 
vation bias estimates, is used to update the unbiased state 
estimates. It should be noted that the biased state is fed back 
into the model. In other words, Eq. (10) is an output equa- 
tion, providing the unbiased state estimates, and is no part of 
the model integration. An analytical expression for the three 
Kalman gains is thus needed. It is relatively straightforward 
to prove that the last update in Eq. (9) leads to unbiased state 
estimates (Appendix A). 


In the above derived expressions, firstly the observation bias 
error covariance appears. For the bias estimation, the a pri- 
ori bias error covariances are used, while for the state update 
Kalman gain the a posteriori error covariance is needed. This 
is a logical consequence of Eq. (9), where the a priori ob- 
servation bias is subtracted from the biased observations in 
the innovations (the difference between the unbiased obser- 
vations and simulations), while the a posteriori bias estima- 
tion is used in the state update. 

Further, for the bias Kalman gains (step 3 in Fig. 1), both 
H k P' k ~ HjT and FU- H J appear in the denominator, while 
for the state Kalman gain (step 5 in Fig. 1) the denominator 
contains FU-P^ H J_ . This can again be explained by the up- 
date equations (Eq. 9). For the bias updates, FU ix k — b' k ~ ) 
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1) Propagation of the state and bias estimates 


f x ‘ 


K~ 

- K m+ 
_ D fc-1 

l K~ 

— b° + 

— U k-1 


2) Propagation of the error covariances 


f P, 

= A k + PJT + Qfc-1 

\ PT 

_ -po+ 

— r k- 1 

l p r 

_ pm+ 

' r k- 1 


3) Calculation of the bias Kalman gains 


= p° k ~ [H fe (p,-+pr)H£+pr+R fc ] 

Kf = -PrH£[H fc (p fc -+Pr)H£ + p r + R fc ] 


4) Updating of the bias error covariances 


f Pr + = 

[I — Kfc] Pfc 

i = 

[i+K-H*]pr 


5) Calculation of the state Kalman gain 


K* = P~ k n T k [H*P*H? + P ? + Rfc] _1 


6) Updating of the state error covariance 


P+ = [I-K fc H fe ]P,7 


7) Updating of the bias estimates 


1 

' hr = br+Kj* 

> k + = br+Kj( 

(y* - K~ ~ 

y, - b°- - H, ( 

(k - tr) 


8) Updating of the state estimates 


** = k - K + + K fc 

(y* - K + ~ H/ 




Fig. 1. Schematic of the methodology for a discrete Kalman filter. 


is substracted from the observations. x ; 7 — b™~ is defined as 
the a priori estimate of the unbiased state, with an error co- 
variance matrix equal to Pj“ — P™~. Both of these error co- 
variances are used in the Kalman gain. For the system state 
update, x k appears in the innovation, and the unbiased error 
covariance P^~ is thus used (step 5 in Fig. 1). 

As a summary, the Kalman gain takes into account the un- 
certainty of all the terms in the update equation, which ex- 
plains the different terms in the denominator for the three 
Kalman gain expressions. 

A final remark is the minus that appears in the expression 
for the forecast bias Kalman gain. This is a direct result from 
the derivation shown in Appendix B. This can be explained 
by the definition of the forecast bias (Eq. 2) and the obser- 
vation bias (Eq. 4). A positive forecast bias means that the 


biased state is larger than the unbiased state. A similar re- 
mark can be made regarding the observation bias. Assume 
a model with positive forecast and observation biases. Fur- 
ther, assume a positive observation system (in other words, 
all nonzero entries in are positive). This means that an 
increase in the state variables will lead to an increase in the 
observation. Assume also that the unbiased observation pre- 
dictions are smaller than the actual unbiased observations 
(meaning that the expression between brackets in Eq. (9) is 
positive). This may imply that either the biased system state 
is underestimated, or the forecast bias is overestimated, or 
the observation bias is underestimated, or a combination of 
these possibilities. The possible overestimation of the fore- 
cast bias explains the minus in the expression for the forecast 
bias Kalman gain. 

2.7 Interpretation of the method 

The objective of this method is to separate the mismatch be- 
tween the observations and the model results into forecast 
and observation bias, and random model and observation er- 
ror. This is an additional difficulty as compared to the bias- 
unaware KF, where this mismatch is separated into random 
model and observation error. The Kalman gain (Eq. 13) can 
be interpreted as the fraction of this mismatch that is assigned 
to the model noise, and maps this mismatch from observa- 
tion space onto state-space through the observation opera- 
tor 1HE. A similar reasoning can be made for the bias-aware 
KF. The Kalman gains K™, K° k (step 3 in Fig. 1), and K/.- 
(step 5 in Fig. 1) indicate the fraction of the mismatch that 
can be attributed to the forecast bias, the observation bias, 
and the random forecast error, respectively. For the forecast 
bias and state estimates, these Kalman gains remap the dif- 
ference between the unbiased observations and the unbiased 
simulations thereof to state-space. 

Step 2 in Fig. 1 also shows that the propagation of the 
prior unbiased state error covariance contains an extra term 
as compared to the propagation for a bias-unaware (or unbi- 
ased) linear system: 

Pj =A*_ 1 P+_ 1 Aj_ 1 +Q*_ 1 . (14) 

More specifically, the a posteriori forecast bias error covari- 
ance needs to be added to the forecast error covariance (the 
term P"' + ). This can be explained by the update equations. 
Essentially, Eq. (1) shows that in the system the biased state 
vector is propagated. The propagation of the biased state er- 
ror covariance thus appears in step 2 in Fig. 1. However, in 
the calculation of the unbiased state, the forecast bias is sub- 
tracted. This implies that the unbiased state error will consist 
of the error in the biased state and the forecast bias, which 
explains the extra term in step 2 in Fig. 1. The definition of 
the unbiased state forecast (or prior state) in Eq. (10) explains 
why the posterior estimate of the forecast bias error covari- 
ance is used in step 2 in Fig. 1. 
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3 Application to nonlinear systems 

3.1 General approach 

The equations in Sect. 2.5 can easily be modified for an ap- 
plication in a nonlinear system. In this respect, a distinction 
must be made between the system model and the bias mod- 
els. In Eq. (5) persistent (and thus linear) bias models are 
used, while the system matrix A*_ i is replaced by a nonlin- 
ear model. One logical way to apply the bias-aware Kalman 
filter is thus to have a mix between an EnKF for the system 
state, and a discrete KF for the biases. 

The use of a persistent bias model is by itself an argument 
for separate bias-estimation with a discrete KF. Because of 
the persistent nature of the bias model, an initial spread in 
the ensemble of biases would remain unaltered during fore- 
cast periods. This can be explained by the fact that during the 
forecast step no observations are used to update the biases, 
which do not change between time steps. Furthermore, the 
spread in the bias ensemble would decrease at each analysis 
step. This is a typical property of the EnKF: the analysis step 
causes the state variables (and also the biases) to merge to a 
specific value. Because of the different forcings and model 
parameters, the state variables will then again evolve to dif- 
ferent values. However, since a persistent bias model is used, 
this evolution will not occur for the biases. This would cause 
filter divergence, unless some artificial inflation would be ap- 
plied, would likely lead to inconsistencies between the state 
and bias estimates. Here, we will leverage off the ensemble 
state error covariance to approximate the bias error covari- 
ance estimate for the discrete KF. 

It should be noted that the mixed approach (an EnKF 
for the system states and a discrete KF for the biases) has 
been applied in several studies focusing on bias estimation 
(De Lannoy et al., 2007). This approach is thus extended here 
for the inclusion of observation biases. 

3.2 Two-stage state and bias estimation versus state 
augmentation 

A two-stage filter has a number of advantages over state aug- 
mentation. Firstly, the dimensions of the state vector do not 
increase. For a small number of state variables this may not 
be important, but for large systems this can be a considerable 
advantage. The calculation of the forecast error covariance 
requires n 2 ■ n e calculations (with n s the number of state vari- 
ables and the number of ensemble members). If the biases 
are added to the state vector, the calculation of P^T would re- 
quire (2 n s + n 0 ) 2 ■ « e calculations (with n 0 the number of 
observations). The increase in the required number of cal- 
culations thus evolves approximately quadratically with the 
number of state variables, which can be a significant draw- 
back for large systems. It should be noted that in the applica- 
tion of the EnKF the forecast error covariance does not need 
to be calculated explicitly (Reichle et al., 2002). However, 


the cross-covariance between the system states and the ob- 
servation simulations needs to be calculated, and a similar 
reasoning can be made. Another advantage is that, if a model 
already contains a bias-unaware EnKF, the bias-aware filter 
equations show that minimal code modification is needed to 
include the bias estimates. 

The separate bias and state estimation is possible through 
the assumption of uncorrelated state and bias errors. State 
augmentation could take these correlations into account, but 
is computationally more expensive through the calculation of 
the cross-covariance between the system states and the obser- 
vation simulations. 

3.3 Estimation of the error covariances 


Step 3 and step 5 in Fig. 1 show that a number of error covari- 
ances are needed: P^, P k , P ' k ~ , P^ - , and P', )+ . These error 
covariances determine the partitioning of the difference be- 
tween the observations and forecasts into the different error 
and bias components. However, it is not straightforward to 
optimally estimate each of these error covariances. Here, we 
describe some assumptions made for this paper. The biased 
state error covariance is given by 

h - p r- as) 

In order to estimate the forecast bias error covariance, one 
thus needs to know the unbiased and the biased state error 
covariances. In this paper, we assume that the unbiased state 
error covariance is a specified fraction of the biased state er- 
ror covariance. This is an approach that is used in many pa- 
pers focusing on the estimation of biases through data assim- 
ilation, including Dee and Da Silva (1998), Drecourt et al. 
(2006), and De Lannoy et al. (2007). Calculating the biased 
state error covariance using the ensemble results, we can thus 
write 


JP*“ = Y P* 

l p r = (i - Y )Pk ' 


(16) 


y is a filter parameter, between zero and one, which can be 
obtained through calibration. A value of zero indicates that 
the entire model error is assumed to be caused by bias, while 
a value of one indicates that noise is the only cause of errors 
in the model results. 

The observation bias error covariances can be estimated 
under the assumption that a more uncertain observation pre- 
diction is accompanied by a more uncertain observation bias. 
For this reason, we estimate P° k ~ as a function of the error co- 
variance of the observation predictions: 

Pr=*H,p-H[. (17) 


k is a filter parameter and can be estimated through calibra- 
tion. Determining a typical value for this parameter is not 
straightforward, as it will depend on the magnitude of the 
different state variables and observations. 
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y and k can thus be calibrated in order to optimize the 
filter performance. If the biases are correctly estimated, the 
innovations in the update equation (step 8 in Fig. 1) should 
consist of white noise since the bias is removed from both 
the observations and the simulations. This means that the au- 
tocorrelation length of these zero-mean innovations should 
be zero. Both parameters can thus be tuned in order to reach 
this zero value. If the system and observations are unbiased, 
zero values for both these parameters should be obtained. To 
summarize, the values for y and k are modified until the au- 
tocorrelation length of the innovations is equal to zero. An 
example of this filter parameter tuning is demonstrated in 
Sect. 7.2. 


3.4 Summary 


In summary, the bias-aware EnKF can be applied as follows. 
First, the states and biases are propagated from time step 
k — 1 to time step k: 


fk,k - 1 (it-i. «4-i) 


L/n- 


K~ 


h>"+ 
°k - 1 

b° + 

°k - 1 


(18) 


fk,k—\ (•) is a nonlinear operator representing the model in 
state-space, including the model parameters and the meteoro- 
logical forcings, i is the ensemble member number, and 
is a realization of the model error, which can be obtained by a 
perturbation of the model parameters, state variables, and/or 
meteorological forcings. 

The next step is then the calculation of P7 using the en- 
semble results: 



At „. 

*k = 7/ £ * l k 


i = 1 


(19) 


Using the calculated Kalman gains and observation bias er- 
ror covariance, the state variables and the biases can then be 
updated: 

k + = b'r + k (n - k - **(*r-*r ). =1 _ N ) 

K + = K~ + (yt - K~ - hk (i‘ k - - b'r) i=l J ■ ( 21 ) 
= K - K + + k * (h - K + - bk (*r - k + ) + »i) 

h k (.) is a nonlinear function, relating the state variables to 
the observations. Note that the two bias update equations are 
deterministic, while the third state update equation is per- 
formed for each indi vidual ensemble member , i is the en- 
semble member, and hk (x'jT — j indicates the 

average simulated observations, calculated across the ensem- 
ble. v l k is a random realization of the observation error, and 
is needed in order to ensure a sufficient ensemble spread 
(Burgers et al., 1998). 


4 Evaluation of the methodology 

The derived equations are tested through a synthetic study. 
A very simple rainfall-runoff model is first calibrated for 
the Zwalm catchment in Belgium. The obtained parameters 
are then used to generate discharge and storage values. The 
synthetically true storages are obtained by adding a prede- 
fined bias to the modeled storage values (which is consistent 
with Eq. 2). The synthetic discharge observations are then 
obtained by adding a predefined observation bias to the dis- 
charge, obtained with these biased storages. Furthermore, a 
random-error term with a predefined standard deviation is 
added to the synthetic discharge observations as well. This 
is consistent with Eq. (3). The synthetic observations are 
then assimilated into the model, and the retrieved storages 
and discharges can then be compared to the synthetic truth 
in order to evaluate the performance of the data assimilation 
algorithm. 


N is the number of ensemble members. P, and Pf' are 
then calculated using Eq. (16), and P^ _ is calculated us- 
ing Eq. (17). The three Kalman gains are then calculated in 
step 3 and step 5 in Fig. 1. Following Reichle et al. (2002), 
P “ H[ can be calculated as the covariance between the un- 
biased state and the measurement predictions, and H* P^T H j? 
as the covariance of the unbiased measurement predictions. 
Analogously, P^ H 2 and H^. FjT can be calculated using 
the biased model results. This implies that P^T and 1^7 do not 
need to be calculated explicitly. 

Before calculating the state Kalman gain, P^~ needs to be 
updated to P"" since this is needed in the expression (step 5 
in Fig. 1). This is performed by 

Pf = p r [! - K\ ■ (20) 


5 Site and data description 

The study is performed in the Zwalm catchment in Belgium. 
Troch et al. (1993) provide a complete description of this 
test site; only a very short overview is given here. The to- 
tal drainage area of the catchment is 1 14.3 km 2 and the total 
length of the perennial channels is 177 km. The maximum 
elevation difference is 150 m. The average annual tempera- 
ture is 10 °C, with January the coldest month (mean temper- 
ature 3 °C) and July the wannest month (mean temperature 
18 °C). The average annual rainfall is 775 mm and is dis- 
tributed evenly throughout the year. The annual actual evap- 
otranspiration is approximately 450 mm. 

Meteorological forcing data with a one-day time step 
from 1994 through 2002 were used in this study. The 
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precipitation and all the variables needed to calculate the po- 
tential evapotranspiration using the Penman-Monteith equa- 
tion were measured by the Belgium meteorological station in 
Kruishoutem. Discharge was measured continuously at the 
catchment outlet in Nederzwalm. 

6 Model description 

The Hydrologiska Byrans Vattenbalansavdelning (HBV) 
model, of which Fig. 2 shows a schematic, was originally 
developed by Linstrom et al. (1997). In this paper, the ver- 
sion of Matgen et al. (2006) is applied. The model uses ob- 
served precipitation (Rtot(O) and potential evapotranspira- 
tion (ETP(f)) as input, both in ms -1 , t is the time in seconds. 
The catchment is divided into a soil reservoir, a fast reservoir, 
and a slow reservoir. There are thus three state variables: the 
amount of water in the soil reservoir ( S{t ), m), the slow reser- 
voir ( ,S'i (?), m), and the fast reservoir (S 2 (t), m). 

A number of fluxes are calculated, which depend on the 
state variables of the system. The actual evapotranspiration 
ETR(f) (m 3 s -1 ) is first determined: 

1 S(t ) 

ETR(f) = — ETP(f). (22) 

7 5max 

A is a dimensionless parameter, and .S max is the storage capac- 
ity of the soil reservoir (m). The infiltration Ri n (t) (ms -1 ) is 
calculated as follows: 

Rm(f) = (\ - Rtot(f). (23) 

\ *Jmax / 

b is a dimensionless parameter. After this, the effective pre- 
cipitation R e ff(f) (ms -1 ) is determined: 

R eS (t) = RtotCO - *in(f). (24) 

The calculation of the percolation D(t ) (ms -1 ) is then 
perfonned: 

D(t) = Pe (l - . (25) 

Pe is a percolation parameter (ms -1 ), and (3 a dimensionless 
parameter. After this, the storage in the soil reservoir at the 
end of the time step can be calculated as follows: 

S(t + At) = S(t ) + (Rin(0 — ETR(f) — Perc(f)) At. (26) 

At is the time step in seconds. S(t + At) is always positive 
after model calibration. 

The input in the fast reservoir /Aft) (ms -1 ) is then 
S(t) 

Ri(t) = a-^R^(t). (27) 

^max 

a is a dimensionless parameter. The outflow from this reser- 
voir 02 (t) (m 3 s — 1 ) is then determined: 

«0 = ,P)' (28) 

\ ^2,max / 



$ 2 , max is the storage capacity of the fast reservoir (m), and 
k 2 (m 3 s -1 ) and \[r (-) are model parameters. After this, the 
storage in the fast reservoir at the end of the time step can be 
calculated as 

S 2 (t + At) = S 2 (t) + (R 2 (t) - 02(0) A t. (29) 

The input in the slow reservoir R\(t) (ms -1 ) is then 
computed: 

Ri(t) = R eS (t) - R 2 (t). (30) 

The outflow from this reservoir 0i (0 (m 3 s -1 ) can be calcu- 
lated as 

Qx{t) = K l S x {t). (31) 

k\ is a model parameter (m 2 s -1 ). Finally, the storage in the 
slow reservoir at the end of the time step is calculated: 

Si(t + At) = Si(t) + (Riit) - 0i(O + Perc(0) At. (32) 

The total discharge q(t) is simply the sum of Qi(t) and 
02(f)- A triangular unit hydrograph is used for runoff rout- 
ing. Since in this paper daily time steps are used, and the con- 
centration time of the catchment is only 14 h (Ferket et al., 
2008), no routing needs to be performed for this study. 

In summary, the model contains ten time-invariant param- 
eters (X, S max , b, a, P e , P, if, S 2 , max, K 2 and /q), and three 
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Table 1. Properties of the observation and forecast bias in the synthetic experiments. 


Experiment Experiment Forecast bias Forecast bias Observation Observation 

set average (mm) amplitude (mm) bias average bias amplitude 




S 

Si 

s 2 

S 

Si 

s 2 

(m 3 s *) 

(m 3 s ') 

1 

1 

0 

0 

0 

0 

0 

0 

0.5 

0 

1 

2 

20 

0.4 

0.2 

0 

0 

0 

0.5 

0 

1 

3 

20 

0.4 

0.2 

0 

0 

0 

0 

0 

2 

1 

0 

0 

0 

10 

0.2 

0.1 

0.5 

0.25 

2 

2 

20 

0.4 

0.2 

10 

0.2 

0.1 

0.5 

0.25 

2 

3 

20 

0.4 

0.2 

10 

0.2 

0.1 

0 

0.25 


state variables per time step (the storages S(t), S\(t ), and 
S 2 (t)). 

The model was calibrated using particle swarm optimiza- 
tion (Kennedy and Eberhart, 1995), using the data from 1994 
through 1997. Table 2 lists the obtained parameter values, 
and Fig. 3 shows the comparison of the modeled to the ob- 
served discharge for the entire simulation period. Based on 
these results, the rainfall-runoff dynamics of the model are 
deemed to be sufficiently accurate to be used in a data assim- 
ilation study. 

7 Results 

7.1 Ensemble generation 

In order to thoroughly evaluate the performance of the filter, 
three experiments are performed. Table 1 shows an overview 
of these experiments. The first experiment considers only ob- 
servation bias and noise, and no forecast bias. In order to 
generate the synthetic truth, no bias is added to the storages, 
and a bias of 0.5 m 3 s -1 and a random error with a standard 
deviation of 0.1m 3 s _1 are added to the synthetically true 
discharge. These synthetic observations are then assimilated 
into the model with an assimilation interval of 7 days. 

The second experiment considers observation bias and 
noise as well as forecast bias. Again, a bias of 0.5 m 3 s _ 1 and 
a random error with a standard deviation of 0.1 m 3 s _I are 
added to the synthetically true discharge. However, in order 
to generate the synthetic truth, bias is added to the storages. 
Forecast bias is generated this way, in a manner consistent 
with the filter theory, so that one can assess to what extent 
the bias-aware EnKF will correctly estimate the true storage 
and discharge values. The value of the bias added to the stor- 
ages is obtained by examining the standard deviation in the 
modeled storages, obtained using the calibrated model pa- 
rameters. The bias was then assumed to be 10 % of this stan- 
dard deviation. This resulted in a bias of 20, 0.4, and 0.2 mm 
for the surface, slow reservoir, and fast reservoir storages, re- 
spectively (see Sect. 6). 

The third experiment considers only observation noise and 
forecast bias. A random error with a standard deviation of 


Table 2. Parameter values used by the hydrologic model. 


Parameter 

Value 

Units 

X 

1.228 

- 

Smax 

0.322 

m 

b 

1.219 

- 

a 

1.512 

- 

Pc 

1.077 x 10“ 8 

ms -1 

p 

1.326 

- 


1.049 

- 

*^2,max 

1.726 x 10“ 2 

m 

K 2 

1.369 x 10“ 7 

ms -1 

K 1 

6.916 x 10“ 7 

s _1 


0. lm 3 s _l is added to the synthetically true discharge, which 
is again obtained by adding a predefined bias to the storage 
values. 

In order to investigate the possibility to estimate a tem- 
porally varying observation and forecast bias, the three ex- 
periments are repeated, but with a sinus wave added to the 
mean biases. The period of this wave is equal to one year and 
the amplitude equal to 0.25 m 3 s -1 for the observation bias, 
and 10, 0.2, and 0.1 mm for the surface, slow reservoir, and 
fast reservoir storages, respectively. 

The experiments are applied with an ensemble of 32 mem- 
bers. The ensemble is generated by adding a Gaussian dis- 
tributed random number with zero mean to each parameter 
value. The standard deviation of this random number is set to 
a fraction of the original parameter value. It was ensured that 
the parameter values did not exceed physical limits. At each 
time step, a random error is added to the observed precipi- 
tation and potential evapotranspiration. Again, the standard 
deviation of the random error is set to a fraction of the orig- 
inal observed value. The fractions to calculate the standard 
deviations are calibrated for each experiment to ensure an ad- 
equate ensemble spread, following De Lannoy et al. (2006). 
Ensemble sizes of 256, 128, 64, 32, and 16 members were 
analyzed, and it was found that for sizes larger than 32 the en- 
semble statistics no longer varied significantly. For this rea- 
son, 32 members were used for the remainder of the analysis. 
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Fig. 3. Evaluation of the modeled discharge. The thick solid lines are the observed discharge, and the thin lines are the model results. 
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Observation 



Fig. 4. Evolution of the estimated biases (the type of bias is indi- 
cated at the top of each plot) for experiment 2 using a multiplica- 
tion factor (k) of 10 (solid lines) and 100 (dotted lines) to calcu- 
late the observation bias error covariance. The top panel indicates 
the observation bias, and the second, third, and bottom panels indi- 
cate the bias in the soil, groundwater, and surface runoff storages, 
respectively. 


7.2 Filter parameter estimation 

Another issue in the application of the bias-aware EnKF is 
the estimation of the parameters y (used in Eq. 16 to esti- 
mate PjT and P™ _ ) and k (used in Eq. 17). The objective of 
the determination of these parameters is to obtain stable bias 
estimates, meaning that the estimates do not continue to de- 
crease or increase as the simulation proceeds, y is clearly a 
value between zero and one, while an indicative value for k is 
more difficult to determine. It has been found that a too low 
value for k leads to observation bias estimates that diverge 
from the true value during the simulation. Figure 4 shows 
this for experiment 2 (a constant observation bias). The ob- 
servation bias and the biases for S and Si evolve to clearly 
unrealistic values. A value of 100 for k can be seen to lead 


Autocorrelation function innovations 



Fig. 5. Ensemble average of the autocorrelation function of the in- 
novations in step 8 of Fig. 1 for each ensemble member, for the case 
of sinusoidal observation and forecast biases and k equal to 100. 


to stable observation bias estimates. The results of the filter 
have been found to be less sensitive to the value of y . A value 
of 0. 1 for this parameter has been found to lead to stable esti- 
mates of the forecast bias. As explained in Sect. 3.3, the auto- 
correlation length of the innovations in step 8 in Fig. 1 should 
be zero with a zero mean. Figure 5 shows the ensemble aver- 
age of the autocorrelation functions for each ensemble mem- 
ber for the case of sinusoidal observation and forecast bi- 
ases. Clearly, for all lags larger than 0, the autocorrelations 
approach zero, which indicates that y has been adequately 
estimated. 

7.3 State and bias estimation for a constant observation 
and forecast bias 

Figure 6 shows the estimated storages for each of the exper- 
iments described above, using a constant observation bias. 
Figure 7 shows the analysis of the modeled discharges and 
the evolution of the estimated observation bias. For each of 
the three experiments, the estimated unbiased storages and 
discharges are compared to the storages obtained using a 
bias-unaware EnKF in which only states are estimated and 
biases are not taken into account. Figure 8 shows the time 
series of the true discharge, the results of the baseline run, 
and the results of the bias-aware EnKF for a selected period 
in the simulation period. 

Examining the results of experiment 1 (only observation 
bias), the advantage of the bias estimation is evident. The 
bias-unaware EnKF leads to relatively large errors in the stor- 
ages, which practically disappear when bias is taken into ac- 
count. Figure 7 shows that taking into account the observa- 
tion bias also leads to an almost perfect estimate of the dis- 
charge. The observation bias shows fluctuations around the 
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S experiment 1 bios unowore 


SI experiment 1 bios unowore 


S2 experiment 1 bios unowore 



S experiment 1 bios aware 



S experiment 2 bios owore 



X : 117.838 
Y : 97.812 
Slop# : 1 .007 
intercept : -20.9( 9 
R : 0.998 
RMSE : 20.134 


50 100 150 

True value (mm) 


S experiment 3 bios unowore 



50 100 150 

True value (mm) 


S experiment 3 bias aware 




SI experiment 1 bias aware 



SI experiment 2 bios unowore 



Mean X : 11.519 
Mean Y : 14.043 
Slope : 0.884 
Intercept : 3.833 
R : 0.930 
RMSE : 2.835 


5 10 15 20 25 

True volue (mm) 


SI experiment 2 bios owore 



SI experiment 3 bios unowore 



SI experiment 3 bias aware 




S2 experiment 1 bios aware 



S2 experiment 2 bios unaware 



S2 experiment 2 bios owore 



S2 experiment 3 bios unowore 



S2 experiment 3 bias aware 



Fig. 6. Comparison of the estimated storages obtained using the bias-aware and a bias-unaware EnKF for the constant observation bias 
experiment. The dashed lines are the regression lines; the solid lines the 1 : line. Synthetic true values are in x axis; estimations in y axis. 
The top, middle, and bottom two rows show the results of experiments 1, 2, and 3, respectively. The first, second, and third column show the 
results for S , S and S 2 , respectively. 
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mean value, but the true value of 0.5 m 3 s _1 is retrieved with 
reasonable accuracy. 

Similar conclusions, but to a lesser extent, can be drawn 
for the results of experiment 2 (both observation and forecast 
bias). Figure 6 shows that the estimation for the three stor- 
ages is better for the bias-aware EnKF as compared to the 
bias-unaware EnKF. For 5) and 5S an almost perfect estimate 
is obtained, while for S the bias is reduced but not eliminated. 
However, as S does not directly influence the discharge (see 
the equations in Sect. 6), this should not lead to errors in 
the discharge estimation. Figure 7 shows that indeed the dis- 
charge is almost perfectly estimated, and that the estimated 
bias fluctuates around the correct value. 

Regarding experiment 3 (only forecast bias), the results 
from the bias-aware EnKF are better than for the bias- 
unaware EnKF. Figure 6 shows that for S the improvement 
is marginal, but for S\ and S 2 a reduction in the RMSE (root 
mean square error) can be observed, while the bias in the es- 
timates is almost unaltered. Figure 7 shows that the discharge 
is almost perfectly estimated, and that the estimation of the 
observation bias again fluctuates around the correct value. In 
experiments 2 and 3, S remains biased due to an observabil- 
ity issue: the observed (assimilated) discharge is not directly 
related to S, and hence this information cannot be efficiently 
used to update the bias estimate. 

Further examination of Figs. 6 and 7 shows that the es- 
timated storages and discharges obtained using the bias- 
aware EnKF are almost identical for experiment 2 and exper- 
iment 3. This can be explained by the relatively accurate esti- 
mation of the observation bias. Since the bias-unaware EnKF 
does not take this into account, the results for experiment 3 
(where no observation bias is present) are significantly better 
than the results for experiment 2 for the bias-unaware EnKF. 

Table 3 shows the relative RMSE increase for both the 
bias-aware and the bias-unaware EnKF. This relative increase 
is defined as 


RI = 100 


RMSE a - RMSE b 
RMSEb 


(33) 


Rl is the relative increase (%), RMSE a the RMSE of the as- 
similation run, and RMSEb the RMSE of the baseline run 
(the run in which no data are assimilated). The RMSE values 
are calculated both for the state variables and the discharges. 
A positive value indicates an increase in RMSE as compared 
to the baseline run, a negative value a decrease, if observation 
and forecast bias are not taken into account, the analyzed dis- 
charge tends to be better estimated than for the baseline run, 
but this does not necessarily mean that the storages are bet- 
ter estimated. In other words, better discharge forecasts do 
not necessarily lead to better estimates of state variables. On 
the other hand, when biases are taken into account, the state 
variables are better estimated as compared to the baseline run 
(except for S). This also leads to better discharge estimates 
as compared to the results of a bias-unaware EnKF. 


Table 3. Relative RMSE values of the analyzed results from the 
bias-unaware and the bias-aware EnKF. 


Experiment Variable Bias-unaware Bias-aware 

EnKF EnKF 


Constant observation and forecast bias 


1 

S 

394.76 

- 81.95 

1 

Si 

324.48 

- 71.18 

1 

s 2 

- 65.77 

- 95.41 

1 

Q 

- 23.36 

- 92.75 

2 

s 

68.42 

0.71 

2 

Si 

257.35 

- 39.42 

2 

S 2 

- 64.62 

- 92.11 

2 

Q 

- 17.3 

- 85.43 

3 

s 

2.81 

0.69 

3 

Si 

- 20.35 

- 39.29 

3 

S 2 

- 89.4 

- 92.11 

3 

Q 

- 0.49 

- 32.15 


Sinusoidal observation and forecast bias 


1 

S 

96.31 

- 15.93 

1 

Si 

298.96 

- 52.93 

1 

s 2 

- 80.7 

- 88.31 

1 

Q 

- 40.36 

- 78.02 

2 

s 

52.95 

2.27 

2 

Si 

219.74 

- 25.42 

2 

S 2 

- 79.17 

- 86.26 

2 

Q 

- 34.01 

- 74.03 

3 

s 

- 9.77 

2.24 

3 

Si 

35.67 

- 25.2 

3 

S 2 

- 72.43 

- 86.26 

3 

Q 

27.73 

- 33.16 


7.4 State and bias estimation for a temporally variable 
observation and forecast bias 

Figure 9 shows the comparison of the modeled storages to 
the synthetic truth for the three experiments with a sinu- 
soidally evolving observation and forecast bias. This sinu- 
soidal bias was chosen since discharge observations tend to 
show a seasonal cycle. The sinusoidal bias may be an exag- 
gerated simplification of the true bias, but it is still more re- 
alistic than a constant bias. Figure 10 shows the comparison 
of the modeled discharge to the synthetic truth and the evo- 
lution of the observation bias. The same conclusions can be 
drawn for the experiments with a constant observation bias. 
In all cases the observation bias and the discharge are better 
estimated with a bias-aware EnKF, as compared to when a 
bias-unaware EnKF is used. The estimation of the storages 
also strongly improves when a bias-aware EnKF is used, as 
opposed to the use of the bias-unaware EnKF. The only ex- 
ception is the estimation of S for experiment 3, where the 
bias and the RMSE slightly worsen. However, this storage 
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Q experiment 1 bias aware 


Observation bias experiment 1 



True value (m 3 s~') 

Q experiment 2 bias unaware 



True value (m 3 s ') 




True value (m 3 s ') 

Q experiment 2 bias aware 



True value (m 3 s ') 

Q experiment 3 bias aware 






Fig. 7. Comparison of the estimated discharges obtained using the bias-aware and a bias-unaware EnKF, and the evolution of the observation 
bias throughout the simulation - both for the constant observation bias experiment. The dashed lines are the regression lines; the solid lines 
the 1 : 1 line. For the bias estimation, the dotted lines refer to the true bias. 


does not directly influence the discharge, which explains the 
almost perfect estimation of the discharge. 

Similar for a constant observation bias, the results from 
experiment 2 and experiment 3 are almost identical when a 
bias-aware EnKF is used, but not when a bias-unaware EnKF 
is used. This can be explained by the relatively accurate esti- 
mation of the observation bias. 

Table 3 shows that, for the comparison of the results to 
the results of the baseline run, the same conclusions can be 
drawn for temporally constant observation and forecast bi- 
ases. If biases are not taken into account, slightly better dis- 
charge estimates than the baseline run are obtained, with not 
necessarily better state estimates (except for experiment 3 
where the discharge estimates are also worsened). Taking 


into account the biases again leads to better state estimates 
(except for the storage S). 

7.5 Benefit of dual observation-forecast bias estimation 

In order to demonstrate the benefit of the estimation of both 
forecast and observation biases, as opposed to the estima- 
tion of forecast biases alone (Dee and Da Silva, 1998; Dee 
and Todling, 2000; Drecourt et al., 2006; De Lannoy et al., 
2007; Bosilovich et al., 2007; Reichle et al., 2010), the ex- 
periments described above were repeated, but the estimation 
of the observation bias was turned off. Under these condi- 
tions, the state and forecast bias estimation reduces to the 
methodology described in De Lannoy et al. (2007). Table 4 
shows the results of the comparison of the analyzed storages 
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Fig. 8. Comparison of the estimated discharge from the baseline 
run and the assimilation run with the bias-aware EnKF to the true 
discharge for a selected part of the simulation period. 

and discharges to the synthetic truth. These statistics can be 
compared to the statistics in Figs. 6, 7, 9, and 10 in order to 
assess whether the estimation of both forecast and observa- 
tion biases leads to better results than the estimation of fore- 
cast bias alone. 

In all cases, as expected, the incorporation of an observa- 
tion bias estimation leads to a better estimate of the states and 
discharges. This is expressed by the lower bias and RMSE 
values in Figs. 6, 7, 9, and 10 than in Table 4. These results 
thus clearly show the benefit of a dual state and observation 
and forecast bias estimation. 

7.6 Analysis of the ensemble spread 

A very important aspect of the application of the Kalman 
filter is to ensure an adequate ensemble spread. Figure 11 
shows the square root of the diagonal elements of the biased 
a priori error covariance matrix (thus the ensemble spread) 
and the observation error covariance for the case with a si- 
nusoidal observation and forecast bias. From these plots the 
conclusions can be drawn that the ensemble spread is sta- 
ble throughout the simulation and that ensemble collapse 
or instability do not occur. For the other experiments, sim- 
ilar results were obtained. Figure 12 shows the relationship 


between these ensemble spreads and the simulated discharge. 
The surface storage is only slightly correlated to the total 
discharge, while the spread in the surface and groundwater 
reservoir storages is more strongly correlated to the total dis- 
charge. As can be expected, the standard deviation of the ob- 
servation bias is strongly correlated to the total discharge. 

7.7 Observability of the biases 

An important aspect in the application of the Kalman fil- 
ter is the observability of the system. This can be examined 
through the observability matrix, of which the rank needs to 
be equal to the number of state variables. This observability 
matrix can be written as 

H* 

H*A* 

° k= v\ k K 2 k ... (34) 

n is the number of state variables, and A k is the Jacobian 
matrix, resulting from the linearization of f k ,k-i in Eq. (18). 
Since we are using persistent bias models, with less observa- 
tions than the number of state variables, the rank of the ob- 
servability matrix for the bias vectors will always be smaller 
than the number of state variables. In other words, the use 
of persistent bias models will always lead to an unobserv- 
able system. However, this observability issue is bypassed 
by the use of the parameters y and k in Eq. (16) and k in 
Eq. (17). What these parameters essentially perform is first 
separate the biased background error covariance into an error 
covariance of the unbiased states and an error covariance of 
the forecast bias and then relate the biased background error 
covariance to the observation bias error covariance. If these 
two parameters are well chosen by examining the correla- 
tion length of the innovations, one should expect an adequate 
partitioning of the mismatch between observations and sim- 
ulations into biases and state variables. This is demonstrated 
by the results mentioned in the sections above. The develop- 
ment of realistic bias models is under investigation, but this 
subject falls outside the scope of this paper. 

7.8 Assimilation of in situ discharge data 

In order to demonstrate the applicability of the dual bias esti- 
mation in real-world situations, in situ observed discharge 
rates are assimilated instead of synthetically true values. 
Again, an observation error standard deviation of 0.1 m 3 s -1 
has been assumed, and an assimilation interval of 7 days has 
been used. Persistent bias models are again used. Figure 13 
shows the results of this experiment. A seasonal cycle in the 
observation bias can be seen. This is very realistic, because 
there is a relatively large scatter in the discharge-water level 
relationship that is used to invert the water level observa- 
tions into discharge values (W. Defloor, Department of Oper- 
ational Water Management, Flemish Enviromnental Agency, 
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Intercept : -23.295 
R : 0.983 
RMSE : 23.216 


50 100 150 

True volua (mm) 


S experiment 3 bias unaware 




S2 experiment 1 bias unaware 





S2 experiment 1 bias aware 



S2 experiment 2 bias unaware 



S2 experiment 2 bias aware 



S2 experiment 3 bias unaware 



S2 experiment 3 bios aware 



Fig. 9. Comparison of the estimated storages obtained using the bias-aware and a bias-unaware EnKF for the sinusoidal observation bias 
experiment. The dashed lines are the regression lines; the solid lines the 1 : 1 line. Synthetic true values are in x axis; estimations in y axis. 
The top, middle, and bottom two rows show the results of experiments 1, 2, and 3, respectively. The first, second, and third column show the 
results for 5, S\, and S 2 , respectively. 
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Q experiment 1 bias unaware 


Q experiment 1 bias aware 


Observation bias experiment 1 




Q experiment 3 bias unaware 





Q experiment 3 bias aware 




Time step (days) 
Observation bias experiment 2 



True value (mV) 


True value (m 3 s ') 



Time step (days) 


Fig. 10. Comparison of the estimated discharges obtained using the bias-aware and a bias-unaware EnKF, and the evolution of the observation 
bias throughout the simulation - both for the sinusoidal observation bias experiment. The dashed lines are the regression lines; the solid lines 
the 1 : 1 line. For the bias estimation, the dotted lines refer to the true bias. 


personal communication, 2011). Since the water levels are 
usually low in the summer and high in the winter, this could 
lead to a seasonal signal in the observation bias. The bias 
in the storages evolves to realistic values. The unbiased dis- 
charge estimates and observations also show a slightly bet- 
ter match than the biased values, which increases confidence 
in the obtained results. This real-world application shows, at 
least for this relatively simple model, that the methodology 
can also be used in realistic situations. Further investigation 
using more complicated models in more challenging environ- 
ments (for example the presence of snow and frozen soils) 
is needed. However, this is outside the scope of the paper, 
which focuses on the development of the theoretical frame- 
work and the initial assessment using a simple model. 


8 Discussion and conclusions 

In this paper, we presented a two-stage hybrid Kalman filter 
for the simultaneous estimation of forecast and observation 
biases and model state variables using the EnKF. The filter 
equations were first derived for a linear system. A first step 
consists of the updating of the forecast and observation bi- 
ases, which are then used to update the unbiased state esti- 
mates. The Kalman gains for the forecast bias, the observa- 
tion bias, and the model state variables can be interpreted 
as the fraction of the mismatch between the observations 
and the corresponding model simulations, which can be at- 
tributed to forecast bias, observation bias, and random fore- 
cast and observation error, respectively, remapping this mis- 
match onto state-space for the model state and bias estimates. 
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Table 4. Statistics of the linear regression between the true states and discharges (X) and the values obtained using an observation bias- 
unaware EnKF (K). 


Experiment 

Variable 

Units 

Mean X 

Mean Y 

Slope 

Intercept 

R 

RMSE 

Constant observation and forecast bias 

1 

S 

mm 

98.350 

95.461 

0.985 

-1.464 

0.998 

3.795 

1 

Si 

mm 

11.147 

13.679 

0.991 

2.626 

0.985 

2.603 

1 

s 2 

mm 

0.859 

1.136 

1.006 

0.270 

0.997 

0.292 

1 

Q 

3 —1 
m s 

1.573 

2.065 

0.986 

0.513 

0.997 

0.501 

2 

s 

mm 

117.838 

96.360 

0.984 

-19.625 

0.997 

21.671 

2 

Si 

mm 

11.519 

13.961 

0.980 

2.672 

0.979 

2.543 

2 

S 2 

mm 

0.917 

1.108 

1.030 

0.163 

0.997 

0.217 

2 

Q 

3 -1 
m s 

1.575 

2.067 

0.981 

0.520 

0.997 

0.502 

3 

s 

mm 

117.838 

97.719 

0.989 

-18.916 

0.996 

20.374 

3 

Si 

mm 

11.519 

11.747 

0.994 

0.293 

0.986 

0.621 

3 

S 2 

mm 

0.917 

0.861 

1.041 

-0.093 

0.997 

0.129 

3 

Q 

3 -1 
m s 

1.575 

1.626 

0.990 

0.066 

0.997 

0.117 

Sinusoidal observation and forecast bias 

1 

s 

mm 

96.975 

90.925 

0.988 

-4.897 

0.982 

9.547 

1 

Si 

mm 

11.291 

13.822 

0.990 

2.633 

0.941 

2.835 

1 

S 2 

mm 

0.833 

1.120 

1.062 

0.235 

0.993 

0.336 

1 

Q 

3 —1 
m s 

1.557 

2.061 

1.023 

0.468 

0.991 

0.536 

2 

s 

mm 

116.468 

86.113 

0.983 

-28.457 

0.976 

31.533 

2 

Si 

mm 

11.674 

13.530 

0.976 

2.128 

0.934 

2.280 

2 

S 2 

mm 

0.885 

1.132 

1.076 

0.178 

0.994 

0.303 

2 

Q 

3 —1 
m s 

1.559 

2.057 

1.011 

0.480 

0.990 

0.534 

3 

s 

mm 

116.468 

98.404 

0.996 

-17.613 

0.975 

20.048 

3 

Si 

mm 

11.674 

13.705 

1.006 

1.957 

0.896 

2.690 

3 

S 2 

mm 

0.885 

0.820 

1.108 

-0.162 

0.992 

0.235 

3 

Q 

3 -1 
m s 

1.559 

1.762 

1.027 

0.160 

0.990 

0.283 


The equations were then modified for nonlinear processes 
and observation systems in an ensemble framework. We fol- 
lowed the same approach as in De Lannoy et al. (2007). 
More specifically, a hybrid approach between an EnKF and 
a discrete KF is suggested, in which the state vector is es- 
timated using the EnKF, and the biases are estimated using 
the discrete KF. The unbiased forecast error covariance and 
the forecast bias error covariance are calculated as fractions 
of the biased forecast error covariance. The observation bias 
error covariance is calculated as a multiplication of the ob- 
servation prediction error covariance. In order to apply the 
algorithm to existing data assimilation codes, minimal code 
modification is needed. 

The developed methodology was then applied to a rainfall- 
runoff model in situations with either only observation bias, 
both observation and forecast bias, or only forecast bias. The 
bias-aware EnKF with both observation and forecast bias es- 
timation outperformed the bias-unaware EnKF as well as the 
bias-aware EnKF with forecast bias estimation only. 

The filter depends on two parameters, y and k, which are 
assumed constant in time and which are used to estimate the 


bias error covariances, k has been estimated by examining 
the temporal evolution of the forecast bias. For low values 
of k, a continuous absolute growth of the forecast and ob- 
servation bias was observed. The results were found to be 
less sensitive to the values for y . It can be argued that these 
parameters should be made temporally variable, thus that a 
model for these two parameters would have to be developed. 
A more realistic model for the evolution of the biases could 
also be developed. Although this could certainly be the sub- 
ject of further investigations, this falls outside the scope of 
this paper. 

A next step in the development of the two-stage hybrid fil- 
ter will be the assimilation of remotely sensed data such as 
soil moisture values into a spatially distributed, physically- 
based land surface model. The reliability of the method can 
be assessed by examining the modeled discharge. It is well 
known that a good estimation of surface soil moisture val- 
ues will not necessarily lead to a good estimation of mod- 
eled discharge and vice versa (Lin et al., 1994). As a con- 
sequence, the use of discharge data for the calibration of 
hydrologic models will probably lead to biased surface soil 
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Fig. 12. Relationship between (1) the standard deviations of the a 
priori biased state error covariance and the a priori observation bias 
error covariance, and (2) the simulated discharge. In the two top 
and the left-hand side bottom panels, P in the y axis refers to the a 
priori forecast error covariance. In the bottom right-hand side panel, 
Po refers to the a priori observation bias error covariance. 


Appendix A 


Fig. 11. Time series of the standard deviation of the a priori biased 
state error covariance (top three panels) and the a priori observa- 
tion bias error covariance (bottom panel) for the experiment with 
sinusoidal observation and forecast biases. 

moisture estimates. The reliability of the newly developed 
method can thus be examined by examining the modeled dis- 
charge values. 

This paper presents an initial step towards dealing with the 
complex problem of joint observation and forecast bias and 
state estimation. The overall conclusion, based on the results 
that have been described, is that there is potential to improve 
data assimilation results if both observation and forecast bias 
are taken into account, as opposed to the use of bias-unaware 
Kalman Filters. 


Proof of the unbiased state estimate 

It is relatively straightforward to prove that the last update 
in Eq. (9) leads to unbiased state estimates. We can calculate 
the temporal average of the state estimate error: 

[*t - xt) = (i; -X k - b? + + k, (n - b°+ - h, (i,- - £"*+)) ). (Al) 

The following definitions of the a priori estimate errors are 


used: 

e + 

e k 

= *k - 

Xk 




e t 

= - 

Xk 

e n r 

K~ 

-K 

K 

= K~ - 

-K 


A similar definition is used for the a posteriori estimate error. 
In Eq. (Al) we add and subtract xy to the right-hand side 
term, and use the previous definition: 

(«* ) = (*t - e[ n+ + K, (y k - b°+ - H, (i~ - 6"'+)) )■ (A3) 

Through substitution ofEq. (3) we can write 

(«* ) = (a - *? + + K (h*x* + v k + b° k - b°+ - H t (s* - *'"+)) (A4) 
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Observation 





Fig. 13. Results of the assimilation of in situ observed discharge rates. Left panels: the bias estimates. The top plot shows the results for the 
observation bias. The following three plots show the bias in 5, Si and S 2 . Right panels: the discharges. 


Substitution of Eqs. (A2) and (2) leads to 

( 4 ) = (4 - 4 + + K * (h* (** ~K) + v k - e°+ - H* (4 - ij + )) (A5) 

Rearrangement and again substituting Eq. (A2) result in the 
following expression: 

= ( [«* - < + + K k (-H*e* + v k - e°+ + n k el ,+ ) j. (A6) 

By definition, each individual error component on the right- 
hand side is equal to zero. For example, for the state vector 
we can write 

x k = x k - b’J! . (A7) 

We add and subtract e k and substitute this once by Eq. (A2): 
x k = x k - e k + x^ - Xk - b \ ". (A8) 

This reduces to 

x k = x k - e k - b' k . (A9) 


Comparison to Eq. (2) shows that by definition the estimate 
error average needs to be zero. The temporal average of the 
a posteriori estimate error is thus equal to zero, which means 
that the system estimates an unbiased state. 


Appendix B 


The derivation for a linear system 

B1 Calculation of the observation bias Kalman gain 


First the expression for the Kalman gain for the observation 
bias will be derived. For this purpose, the a posteriori error 
covariance of the observation bias (P£ + ) needs to be mini- 
mized. Substitution of Eq. (9) by Eq. (11) leads to 


{k - k~ - n (n - h k i- k + h ,*r - *r)) 

■ (5 - K~ - K (y k - n k i- k + H - b° k -)) T 


(Bl) 
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Expansion of this product, placing the Kalman gains and 
observation matrices outside the expected value expressions 
(they are constant over a time step), using Eq. (3) for the 
substitution of y k , and knowing that the unbiased states, the 
biases and the random error are independent of each other, 
leads to 

k + = e [(*? - k ) (*? - ft?-) 7 ] - ke [(*? - hi ) (*? - *r) T ] 

~E [(*? - *r) (*? - ft?-) 7 ] K T + K?H t 

E [(it - V) (** - K ) ] H f K f 

+ K?fi[(*?-ft?-) (ft?-i?-) T ]K? T 

+ K?H t £ [(ft?" - b'"-) (ft? 1 - ft?-) 7 ] HfK f + K ?£ f »f] K f. (B2) 

Substitution of Eqs. (11) and (12) results in the following 
expression: 

K + = K~ - KK~ - K ~ Kf + k?h a (p* + pr) 

H l Kf + K? Pf Kf + K° k R, Kf . (B3) 

The minimization of Pf means that the first derivative with 
respect to K° k is equal to zero. Calculation of this first deriva- 
tive and equalizing this to zero lead to 

[h, (p- + P r) H[ + Pf] Kf + R,Kf = Pf . (B4) 

After rearrangement, the Kalman gain for the observation 
bias can be written as 


k = pf [«* (p* + pr) h i + pf + Ri 


(B5) 


Pf = E 


{k - hr) fa-br) 


+ K H/. E 


+ E 
+ KfH k E 
+ K 
+K™ 


(k - K ) (r - K ) 

(K-b'r) 

(*k-*k) (** ~*k) 

(k - b° k -) (b° k - hr) 1 


H[Kf 


H[Kf 


K'f 


+ K 


h kE [(*? - b'r) (*? - hr y 

[nv T k ]Kf. 


it T T 

H k 


(B7) 


Substitution of Eqs. (11) and (12) results in the following 
equation: 


p"'+ _ p'»- 


k'"h a .p r - 

-K-H A p-H[Kf + 

■KrtPfH[Kf 


■prH[Kf 

Km no- | smT 

k F k 

IK” R k Kf. 


(B8) 


Derivation of this expression with respect to K™ and equal- 
ization to zero leads to 

[h* (p * + K~) Hjf + p r] Kf + R A Kf = (B9) 

Rearrangement finally leads to the analytical expression for 
the Kalman gain for the forecast bias: 


K m nm — rjT 

k — ~ V k 


[H * (p* + Pf) H[ + Pf + R*] ' . (BIO) 


B3 Calculation of the Kalman gain of the states 


B2 Calculation of the Kalman gain of the forecast bias 

Similarly as in the derivation of the expression for the obser- 
vation bias Kalman gain, the a posteriori error covariance of 
the forecast bias (P^ i+ ) needs to be minimized. Substitution 
of Eq. (9) by Eq. (11) leads to 


The objective is to estimate the unbiased state vector as accu- 
rately as possible. For this purpose, similarly as in the deriva- 
tion of the expression for the biases Kalman gain, the unbi- 
ased a posteriori error covariance of the model states (Pf 
needs to be minimized. Substitution of Eq. (9) by Eq. (11) 
leads to 


Pf 


= E 


(k - *r - K “ (j* - + H*ftf - *?-)) 

■ (K - K~ - K (h - "f? + H f r - ft ?-)) 7 


. (B6) 


Again, expanding this product, placing the Kalman gains and 
observation matrices outside the expected value expressions, 
using Eq. (3) for the substitution of y A , and knowing that the 
unbiased states, the biases, and the random error are indepen- 
dent of each other, results in the following expression: 


(** - ** + ftf - K (m 
(xk -if • +ftf - k k (j k 


H k~x k + H t .ft?- + - i?+)) 
Hff + Hff -*?+)) T 


■ (Bl 1) 


Similarly as in the derivation of the Kalman gains for the bi- 
ases, this product is expanded, placing the Kalman gains and 
observation matrices outside the expected value expressions. 
Equation (3) is used for the substitution of y A , knowing that 
the unbiased states, the biases, and the random error are in- 
dependent of each other, results in the following expression: 


= E [ft - x k ) ( x k - x k ) r j - K t H t E ft - x k ) ( x k - 

~ E [{x k ~ i~ k ) (x k ~ x~ k ) T ] H l K[ + K Ht 
E [{x k - x-) (x k - x~) T ] H f K[ + Kt 
E [(*? - *? + ) (*? - ft?"*") 7 ] K[ + KkE [„* v T k ] Kf . 


(B12) 
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Substitution of Eqs. (1 1) and (12) leads to 

K = ~ K*H,P- - P, H l K[ + K, H/ P- H [ K[ 

+ K k P''+ K[ + K k R, K [ . (B 13) 

Calculation of the first derivative with respect to K k and 
equalization to zero results in the following expression: 


H,P-H[+P a °+ 


k; +r*k^ =h,p-. 


(B14) 


After rearrangement, the Kalman gain for the unbiased states 
can be written as 


K, = P,7H[ [h a .P“H[ + P°+ + R,] 


(B15) 


B4 Updating of the error covariances 


In the application of the Kalman filter, the error covariances 
need to be updated. This is performed using the previously 
derived expressions for the a posteriori error covariances 
(Eqs. B3, B8 and B13). These equations can be rewritten as 


P+ =Pj -K 1 H i p--p-HjKj+K i [H i p-H[+ Pr + K li]Kl 

K* = K - K K - K Kf + Kt [«i (A + ft) H[ + pj- + Rt] Kf 

P? + = 'T + K» Hi P?- + Pfc "I K1‘ T + Kj f [Hi (p s - + p?-) H l + Pj" + Hi] Kl‘ T 


(B 16) 


Expansion of the product, knowing that the errors in the un- 
biased state estimates, the biases, and the noise are indepen- 
dent of each other, and substituting the expressions for the 
error covariances, leads to: 


P* =A,_iP+_iA[_i 


p»<+ 

r k 


Qa-i 


-Ait- 


— E 


(»?-! - K-,) (k - K + )' 

(k - »r) (*?-i - 


H - 1 • 


(B20) 


We know that b"‘ is equal to b"'_ { and that b"' is equal to 
b™_ [ . This is written in the system description (Eqs. 5 and 8). 
We can thus write the following: 


E 


(Vk-1-tpi) (K-K + y 

= E[(K-i-i>r) (k - *r 

+ K”'(yt-Ht^-+H k b'-~ -b° k ~)) T . 


(B21) 


Substitution of the observation equation (Eq. 3) and rear- 
rangement lead to 


Substitution of the Kalman gains (step 3 in Fig. 1) and leav- 
ing out the factors that cancel each other lead to 



[I K, H/i ] P- 

"i - K k ] p r 

I + K“ H a .] P'"- 


(B17) 


E 


{K-x-K-i) (k-k + ) t 

= K -1 + E [(*?-! - K\) ( 

+ Vk + b° k — H^ 5c k + H,t b™ 


n k x k - ^ K 

, T 


-K-y 


K 


mT 


(B22) 


The minus in the expression for the forecast bias Kalman 
gain explains the plus in the propagation of the forecast bias 
error covariance. The fact that a remapping of the system 
state and the forecast bias to observation space is needed 
explains the V\ k factor in the equations for the system state 
and forecast bias error covariances, while this factor does not 
appear in the observation bias error covariance propagation 
equation. 


B5 Propagation of the state error covariance 


The a posteriori error covariances need to be propagated from 
time step k — 1 to time step k. First, we will propagate the un- 
biased state error covariance. Substituting the definition for 
the forecast bias (Eq. 2) by Eq. (11) leads to 


P 


k 



h m — r 

Du X ,, 




*?-^-+*r) r ](Bi 8 ) 


Substitution of the system equation (Eq. 1), again substitut- 
ing Eq. (2) for simplification, results in 


P 


k 


= E 


(a*-i (xr-i - - b k + K + + w*-i) 

■ (Aj_i (x k -i - - b'£ + b™ + + w k -i^ 


■ (B19) 


We know that the forecast bias is independent of the observa- 
tion bias and of the observation noise. This expression thus 
reduces to 


(K-x-K\) (bZ-K + )‘ 

- K-i + E [( 5 -i - K-i) (h tx k - h kb'l 

-h ,5 + u k b';;-y 


(B23) 


This can be rewritten as 


E 


[(»?-■- in) (K-i>r) r 



k; 


mT 


(B24) 


We know that the unbiased state errors are independent of 
the bias errors, so the last term reduces to zero. We can thus 
write the propagation equation as 


P* = A*_ i P+_ , A[_ j + P5 + Q*_ i . (B25) 


The use of a persistent bias model will lead to a propagation 
of the bias error covariances to the next time step without 
transformation: 
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